Fusion of voice signal information for detection of mild laryngeal pathology
Authors
Abstract
The main objective of this study is the detection of mild laryngeal disorders using acoustic parameters of the human voice. Observations of sustained phonation (audio recordings of a vocalized /a/) are labeled by clinical diagnosis and rated by severity (from 0 to 3). The research is restricted exclusively to healthy (severity 0) and mildly pathological (severity 1) cases, the two classes that are most difficult to distinguish. The approach adopted here combines comprehensive voice signal characterization with information fusion. Characterization is obtained through a diverse feature set, containing 26 feature subsets of varying size, extracted from the voice signal. The usefulness of feature-level and decision-level fusion is explored using support vector machine (SVM) and random forest (RF) as base classifiers. For both types of fusion we also investigate the influence of feature selection on model accuracy. To improve decision-level fusion we introduce a simple unsupervised technique for ensemble design, based on partitioning the feature set by k-means clustering, where the parameter k controls the size and diversity of the prospective ensemble. All types of fusion resulted in an evident improvement over the best individual feature subset. However, none of the types, including fusion setups comprising feature selection, proved to be significantly superior to the rest. The proposed ensemble design by feature set decomposition discernibly enhanced decision-level fusion and significantly outperformed feature-level fusion. An ensemble of RF classifiers, induced from a cluster-based partitioning of the feature set, achieved an equal error rate of 13.1 ± 1.8% in the detection of mildly pathological larynx. This is a very encouraging result, considering that detection of mild laryngeal disorder is a more challenging task than the common discrimination between healthy cases and a wide spectrum of pathological cases.
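The ensemble design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: each feature is represented by its standardized values across recordings, the k-means initialization is a deterministic farthest-point rule, decision-level fusion is a plain average of member scores, and the equal error rate is approximated by a threshold sweep. All function names (`partition_features`, `fuse_scores`, `equal_error_rate`) are hypothetical. Training of the base classifiers (SVM or RF, one per feature group) is omitted.

```python
from statistics import mean, pstdev


def standardize(column):
    """Scale one feature column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in column]


def sqdist(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_features(X, k, n_iter=100):
    """Group the columns of X (rows = recordings) into at most k clusters.

    Assumption: a feature is described by its standardized values over all
    recordings. Returns lists of column indices, one per non-empty cluster.
    """
    cols = [standardize([row[j] for row in X]) for j in range(len(X[0]))]
    # Farthest-point initialization (illustrative, deterministic choice).
    centers = [cols[0]]
    while len(centers) < k:
        centers.append(max(cols, key=lambda col: min(sqdist(col, c) for c in centers)))
    assign = None
    for _ in range(n_iter):
        # Assignment step: nearest center for every feature column.
        new_assign = [min(range(k), key=lambda c: sqdist(col, centers[c])) for col in cols]
        if new_assign == assign:
            break
        assign = new_assign
        # Update step: each center becomes the mean of its member columns.
        for c in range(k):
            members = [col for col, a in zip(cols, assign) if a == c]
            if members:
                centers[c] = [mean(vals) for vals in zip(*members)]
    groups = [[j for j, a in enumerate(assign) if a == c] for c in range(k)]
    return [g for g in groups if g]


def fuse_scores(member_scores):
    """Decision-level fusion: average the per-recording scores of all members."""
    return [mean(s) for s in zip(*member_scores)]


def equal_error_rate(scores, labels):
    """Approximate EER (labels: 1 = pathological, 0 = healthy).

    Sweeps thresholds over the observed scores and returns the mean of the
    false acceptance and false rejection rates where they are closest.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = None
    for t in sorted(set(scores)):
        frr = sum(s < t for s in pos) / len(pos)   # pathological cases missed
        far = sum(s >= t for s in neg) / len(neg)  # healthy cases flagged
        if best is None or abs(far - frr) < best[0]:
            best = (abs(far - frr), (far + frr) / 2)
    return best[1]
```

In such a setup, each index group returned by `partition_features` would define the input columns of one ensemble member, so a larger k yields more members that each see fewer features; the members' output scores would then be combined with `fuse_scores` and evaluated with `equal_error_rate`.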
Similar resources
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Exploring Kernels in SVM-based Classification of Larynx Pathology from Human Voice
In this paper identification of laryngeal disorders using cepstral parameters of human voice is investigated. Mel-frequency cepstral coefficients (MFCC), extracted from audio recordings, are further approximated, using 3 strategies: sampling, averaging, and estimation. SVM and LS-SVM categorize preprocessed data into normal, nodular, and diffuse classes. Since it is a three-class problem, vario...
Etiologies of Dysphonia in Patients Referred to ENT Clinics Based on videolaryngoscopy
Introduction: Laryngeal dysfunction may be divided into three categories: organic, neurologic, and functional disorders. Dysphonia and hoarseness are the most common symptoms and, in some cases, the only signs of laryngeal dysfunction. In the differential diagnosis of any type of chronic hoarseness, a neoplastic process must be considered, and thus continuous light video laryngoscopy can provide imp...
Towards Voice and Query Data-based Non-invasive Screening for Laryngeal Disorders
The topic of the research is the exploration and fusion of non-invasive measurements for accurate detection of a pathological larynx. Measurements for a human subject encompass the results of a specific survey and information extracted by the openSMILE toolkit from several audio recordings of sustained phonation (vowel /a/). The clinical diagnosis, assigned by a medical specialist, is the target attribute for binary cl...
Role of the Internal Superior Laryngeal Nerve in the Motor Responses of Vocal Cords and the Related Voice Acoustic Changes
Background: Repeated efforts by researchers to impose voice changes by laryngeal surface electrical stimulation (SES) have come to no avail. The present pre-experimental study employed a novel method for SES application so as to evoke the motor potential of the internal superior laryngeal nerve (ISLN) and create voice changes. Methods: Thirty-two normal individuals (22 females and 10 males) par...
Journal title:
- Appl. Soft Comput.
Volume 18, Issue
Pages -
Publication date: 2014